Overview

Dataset statistics

Number of variables: 13
Number of observations: 14116
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.4 MiB
Average record size in memory: 104.0 B

Variable types

Numeric: 8
Categorical: 5

Alerts

df_index is highly correlated with employee_id (High correlation)
employee_id is highly correlated with df_index (High correlation)
age is highly correlated with n_projects (High correlation)
n_projects is highly correlated with age (High correlation)
employee_id is highly correlated with df_index and 1 other field (High correlation)
age is highly correlated with marital_status and 1 other field (High correlation)
marital_status is highly correlated with age and 1 other field (High correlation)
avg_monthly_hrs is highly correlated with n_projects and 2 other fields (High correlation)
n_projects is highly correlated with age and 4 other fields (High correlation)
satisfaction is highly correlated with avg_monthly_hrs and 2 other fields (High correlation)
status is highly correlated with employee_id and 3 other fields (High correlation)
df_index is uniformly distributed (Uniform)
df_index has unique values (Unique)
employee_id has unique values (Unique)

Reproduction

Analysis started: 2022-04-25 14:12:35.601132
Analysis finished: 2022-04-25 14:13:02.900641
Duration: 27.3 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 14116
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7071.104704
Minimum: 0
Maximum: 14144
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 110.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 705.75
Q1: 3532.75
Median: 7070.5
Q3: 10610.25
95th percentile: 13438.25
Maximum: 14144
Range: 14144
Interquartile range (IQR): 7077.5

Descriptive statistics

Standard deviation: 4084.958507
Coefficient of variation (CV): 0.5776973582
Kurtosis: -1.200746217
Mean: 7071.104704
Median Absolute Deviation (MAD): 3539
Skewness: 0.0003525578936
Sum: 99815714
Variance: 16686886
Monotonicity: Strictly increasing
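
Several of the figures above are derived from others in the same block. The following is a minimal consistency check, not part of the original report: it only reuses numbers printed for df_index, and the variable names are illustrative.

```python
# Values copied from the df_index statistics in this report.
q1, q3 = 3532.75, 10610.25
minimum, maximum = 0, 14144
mean, std = 7071.104704, 4084.958507

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation

print(iqr, value_range, round(cv, 6), round(variance))
```

The recomputed values agree with the reported IQR, range, CV and variance up to rounding of the printed inputs.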
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
0                     1      < 0.1%
9433                  1      < 0.1%
9422                  1      < 0.1%
9423                  1      < 0.1%
9424                  1      < 0.1%
9425                  1      < 0.1%
9426                  1      < 0.1%
9427                  1      < 0.1%
9428                  1      < 0.1%
9429                  1      < 0.1%
Other values (14106)  14106  99.9%

Smallest values

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Largest values

Value  Count  Frequency (%)
14144  1      < 0.1%
14143  1      < 0.1%
14142  1      < 0.1%
14141  1      < 0.1%
14140  1      < 0.1%
14139  1      < 0.1%
14138  1      < 0.1%
14137  1      < 0.1%
14136  1      < 0.1%
14135  1      < 0.1%

employee_id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 14116
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112120.6578
Minimum: 100101
Maximum: 148988
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 110.4 KiB

Quantile statistics

Minimum: 100101
5th percentile: 101224.75
Q1: 105773.5
Median: 111293.5
Q3: 116655.25
95th percentile: 128001
Maximum: 148988
Range: 48887
Interquartile range (IQR): 10881.75

Descriptive statistics

Standard deviation: 8497.639403
Coefficient of variation (CV): 0.07579013156
Kurtosis: 2.759148176
Mean: 112120.6578
Median Absolute Deviation (MAD): 5445.5
Skewness: 1.304670041
Sum: 1582695205
Variance: 72209875.42
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
100101                1      < 0.1%
114863                1      < 0.1%
114846                1      < 0.1%
114847                1      < 0.1%
114849                1      < 0.1%
114851                1      < 0.1%
114852                1      < 0.1%
114853                1      < 0.1%
114855                1      < 0.1%
114856                1      < 0.1%
Other values (14106)  14106  99.9%

Smallest values

Value   Count  Frequency (%)
100101  1      < 0.1%
100102  1      < 0.1%
100103  1      < 0.1%
100105  1      < 0.1%
100106  1      < 0.1%
100107  1      < 0.1%
100108  1      < 0.1%
100109  1      < 0.1%
100110  1      < 0.1%
100111  1      < 0.1%

Largest values

Value   Count  Frequency (%)
148988  1      < 0.1%
148947  1      < 0.1%
148916  1      < 0.1%
148879  1      < 0.1%
148877  1      < 0.1%
148842  1      < 0.1%
148768  1      < 0.1%
148737  1      < 0.1%
148719  1      < 0.1%
148640  1      < 0.1%

age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 36
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.89600453
Minimum: 22
Maximum: 57
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 110.4 KiB

Quantile statistics

Minimum: 22
5th percentile: 22
Q1: 24
Median: 29
Q3: 41
95th percentile: 52
Maximum: 57
Range: 35
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 9.975000045
Coefficient of variation (CV): 0.3032283156
Kurtosis: -0.867621121
Mean: 32.89600453
Median Absolute Deviation (MAD): 6
Skewness: 0.7008830699
Sum: 464360
Variance: 99.50062591
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)
Common values

Value              Count  Frequency (%)
24                 1308   9.3%
25                 1246   8.8%
23                 1196   8.5%
22                 1166   8.3%
27                 662    4.7%
29                 660    4.7%
28                 647    4.6%
26                 626    4.4%
42                 303    2.1%
37                 284    2.0%
Other values (26)  6018   42.6%

Smallest values

Value  Count  Frequency (%)
22     1166   8.3%
23     1196   8.5%
24     1308   9.3%
25     1246   8.8%
26     626    4.4%
27     662    4.7%
28     647    4.6%
29     660    4.7%
30     275    1.9%
31     225    1.6%

Largest values

Value  Count  Frequency (%)
57     34     0.2%
56     22     0.2%
55     38     0.3%
54     226    1.6%
53     235    1.7%
52     252    1.8%
51     227    1.6%
50     233    1.7%
49     243    1.7%
48     272    1.9%

gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 110.4 KiB

Length

Max length: 6
Median length: 4
Mean length: 4.684188155
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Male
2nd row: Female
3rd row: Male
4th row: Female
5th row: Male

Common Values

Value   Count  Frequency (%)
Male    9287   65.8%
Female  4829   34.2%

Length

Histogram of lengths of the category

Pie chart

marital_status
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 110.4 KiB

Length

Max length: 9
Median length: 9
Mean length: 8.021677529
Min length: 7

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Unmarried
2nd row: Unmarried
3rd row: Unmarried
4th row: Unmarried
5th row: Unmarried

Common Values

Value      Count  Frequency (%)
Unmarried  7211   51.1%
Married    6905   48.9%

Length

Histogram of lengths of the category

Pie chart

avg_monthly_hrs
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 249
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 199.9926325
Minimum: 49
Maximum: 310
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 110.4 KiB

Quantile statistics

Minimum: 49
5th percentile: 128
Q1: 155
Median: 199
Q3: 245
95th percentile: 275
Maximum: 310
Range: 261
Interquartile range (IQR): 90

Descriptive statistics

Standard deviation: 50.82695196
Coefficient of variation (CV): 0.2541441219
Kurtosis: -1.044324553
Mean: 199.9926325
Median Absolute Deviation (MAD): 45
Skewness: 0.01643063443
Sum: 2823096
Variance: 2583.379046
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
135                 143    1.0%
156                 141    1.0%
151                 140    1.0%
149                 139    1.0%
145                 125    0.9%
143                 124    0.9%
160                 123    0.9%
260                 118    0.8%
154                 118    0.8%
148                 118    0.8%
Other values (239)  12827  90.9%

Smallest values

Value  Count  Frequency (%)
49     3      < 0.1%
52     1      < 0.1%
54     2      < 0.1%
55     1      < 0.1%
56     1      < 0.1%
60     1      < 0.1%
63     1      < 0.1%
65     1      < 0.1%
66     2      < 0.1%
67     4      < 0.1%

Largest values

Value  Count  Frequency (%)
310    18     0.1%
309    15     0.1%
308    19     0.1%
307    14     0.1%
306    17     0.1%
305    18     0.1%
304    17     0.1%
303    6      < 0.1%
302    8      0.1%
301    21     0.1%

department
Categorical

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 110.4 KiB

Length

Max length: 9
Median length: 6
Mean length: 6.427103995
Min length: 6

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: D00-SS
2nd row: D00-MN
3rd row: D00-ENG
4th row: D00-MT
5th row: D00-ENG

Common Values

Value             Count  Frequency (%)
D00-SS            4601   32.6%
D00-ENG           2573   18.2%
D00-SP            2108   14.9%
D00D00-IT         1152   8.2%
D00-PD            853    6.0%
D00-MT            812    5.8%
D00-FN            722    5.1%
D00-MN            590    4.2%
D00-IT            207    1.5%
D00-AD            175    1.2%
Other values (2)  323    2.3%

Length

Histogram of lengths of the category

last_evaluation
Real number (ℝ≥0)

Distinct: 12185
Distinct (%): 86.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7183221625
Minimum: 0.316175
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 110.4 KiB

Quantile statistics

Minimum: 0.316175
5th percentile: 0.45900175
Q1: 0.5795165
Median: 0.7183221625
Q3: 0.85685375
95th percentile: 0.976118
Maximum: 1
Range: 0.683825
Interquartile range (IQR): 0.27733725

Descriptive statistics

Standard deviation: 0.1636994965
Coefficient of variation (CV): 0.2278914741
Kurtosis: -0.9861755436
Mean: 0.7183221625
Median Absolute Deviation (MAD): 0.138693
Skewness: -0.06847563778
Sum: 10139.83565
Variance: 0.02679752515
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
0.7183221625          1487   10.5%
1                     356    2.5%
0.896246              3      < 0.1%
0.66631               2      < 0.1%
0.838646              2      < 0.1%
0.974814              2      < 0.1%
0.889985              2      < 0.1%
0.744834              2      < 0.1%
0.955579              2      < 0.1%
0.574166              2      < 0.1%
Other values (12175)  12256  86.8%

Smallest values

Value     Count  Frequency (%)
0.316175  1      < 0.1%
0.317279  1      < 0.1%
0.320953  1      < 0.1%
0.322828  1      < 0.1%
0.324239  1      < 0.1%
0.325885  1      < 0.1%
0.328417  1      < 0.1%
0.329813  1      < 0.1%
0.33132   1      < 0.1%
0.331545  1      < 0.1%

Largest values

Value     Count  Frequency (%)
1         356    2.5%
0.999808  1      < 0.1%
0.99939   1      < 0.1%
0.999365  1      < 0.1%
0.999259  1      < 0.1%
0.99921   1      < 0.1%
0.99915   1      < 0.1%
0.999145  1      < 0.1%
0.999113  1      < 0.1%
0.999062  1      < 0.1%

n_projects
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.777769906
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 110.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 3
Median: 4
Q3: 5
95th percentile: 6
Maximum: 7
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.249693245
Coefficient of variation (CV): 0.3308018422
Kurtosis: -0.4814501348
Mean: 3.777769906
Median Absolute Deviation (MAD): 1
Skewness: 0.3152883573
Sum: 53327
Variance: 1.561733206
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Common values

Value  Count  Frequency (%)
4      4044   28.6%
3      3788   26.8%
5      2566   18.2%
2      2322   16.4%
6      1093   7.7%
7      242    1.7%
1      61     0.4%

Smallest values

Value  Count  Frequency (%)
1      61     0.4%
2      2322   16.4%
3      3788   26.8%
4      4044   28.6%
5      2566   18.2%
6      1093   7.7%
7      242    1.7%

Largest values

Value  Count  Frequency (%)
7      242    1.7%
6      1093   7.7%
5      2566   18.2%
4      4044   28.6%
3      3788   26.8%
2      2322   16.4%
1      61     0.4%

salary
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 110.4 KiB

Length

Max length: 6
Median length: 4
Mean length: 4.374256163
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: low
2nd row: high
3rd row: low
4th row: low
5th row: low

Common Values

Value   Count  Frequency (%)
low     6889   48.8%
medium  6086   43.1%
high    1141   8.1%

Length

Histogram of lengths of the category

Pie chart

satisfaction
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 13493
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6216535485
Minimum: 0.0400584
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 110.4 KiB

Quantile statistics

Minimum: 0.0400584
5th percentile: 0.13733525
Q1: 0.452826
Median: 0.6525485
Q3: 0.82296025
95th percentile: 0.969316
Maximum: 1
Range: 0.9599416
Interquartile range (IQR): 0.37013425

Descriptive statistics

Standard deviation: 0.2491466421
Coefficient of variation (CV): 0.4007805355
Kurtosis: -0.6424061277
Mean: 0.6216535485
Median Absolute Deviation (MAD): 0.1837245
Skewness: -0.481363745
Sum: 8775.261491
Variance: 0.06207404925
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
1                     356    2.5%
0.6525485             150    1.1%
0.882892              2      < 0.1%
0.414375              2      < 0.1%
0.570954              2      < 0.1%
0.697019              2      < 0.1%
0.922457              2      < 0.1%
0.497612              2      < 0.1%
0.470955              2      < 0.1%
0.565165              2      < 0.1%
Other values (13483)  13594  96.3%

Smallest values

Value      Count  Frequency (%)
0.0400584  1      < 0.1%
0.0401908  1      < 0.1%
0.0404774  1      < 0.1%
0.0413017  1      < 0.1%
0.0424075  1      < 0.1%
0.0448441  1      < 0.1%
0.0455807  1      < 0.1%
0.0460936  1      < 0.1%
0.0494592  1      < 0.1%
0.0495488  1      < 0.1%

Largest values

Value     Count  Frequency (%)
1         356    2.5%
0.99988   1      < 0.1%
0.999763  1      < 0.1%
0.999704  1      < 0.1%
0.999593  1      < 0.1%
0.999586  1      < 0.1%
0.999556  1      < 0.1%
0.999512  1      < 0.1%
0.999439  1      < 0.1%
0.999355  1      < 0.1%

tenure
Real number (ℝ≥0)

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.492419949
Minimum: 2
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 110.4 KiB

Quantile statistics

Minimum: 2
5th percentile: 2
Q1: 3
Median: 3
Q3: 4
95th percentile: 6
Maximum: 10
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.453547798
Coefficient of variation (CV): 0.4162007487
Kurtosis: 4.857579404
Mean: 3.492419949
Median Absolute Deviation (MAD): 1
Skewness: 1.871919281
Sum: 49299
Variance: 2.1128012
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Common values

Value  Count  Frequency (%)
3      6156   43.6%
2      3019   21.4%
4      2386   16.9%
5      1363   9.7%
6      659    4.7%
10     198    1.4%
7      180    1.3%
8      155    1.1%

Smallest values

Value  Count  Frequency (%)
2      3019   21.4%
3      6156   43.6%
4      2386   16.9%
5      1363   9.7%
6      659    4.7%
7      180    1.3%
8      155    1.1%
10     198    1.4%

Largest values

Value  Count  Frequency (%)
10     198    1.4%
8      155    1.1%
7      180    1.3%
6      659    4.7%
5      1363   9.7%
4      2386   16.9%
3      6156   43.6%
2      3019   21.4%

status
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 110.4 KiB

Length

Max length: 8
Median length: 8
Mean length: 7.049305752
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Employed
2nd row: Employed
3rd row: Employed
4th row: Employed
5th row: Left

Common Values

Value     Count  Frequency (%)
Employed  10761  76.2%
Left      3355   23.8%

Length

Histogram of lengths of the category

Pie chart

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
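
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs: the function names and the toy data are assumptions, and ties are given average ranks.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n factors of the covariance and the variances cancel in the ratio.
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho is Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): y is a nonlinear but strictly increasing function of x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this toy data ρ reaches its maximum because the relationship is perfectly monotonic, while r stays below 1 because the relationship is not linear.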

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
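
A minimal sketch of this formula follows, assuming plain Python lists as input; the function name is illustrative. The inputs are the avg_monthly_hrs and n_projects values of the first five sample rows of this report, used only to demonstrate the invariance under location and scale changes.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n factors of the covariance and the variances cancel in the ratio.
    return cov / math.sqrt(var_x * var_y)


hours = [156.0, 172.0, 268.0, 192.0, 145.0]  # avg_monthly_hrs, first five rows
projects = [2, 3, 3, 3, 2]                   # n_projects, first five rows

r = pearson_r(hours, projects)
# Rescaling and shifting one variable (positive scale factor) leaves r unchanged.
r_rescaled = pearson_r([2 * h + 10 for h in hours], projects)
print(r, r_rescaled)
```

Five rows are far too few to say anything about the dataset; the point is only that r and r_rescaled coincide.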

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
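
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition on hypothetical toy data. Library implementations often use a tie-corrected variant (tau-b), so results on data with ties may differ from what a library reports.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        pairs += 1
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables order the pair the same way
        elif sign < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # sign == 0 means a tie in x or y; it counts toward neither group
    return (concordant - discordant) / pairs


# Toy data (hypothetical): one swapped pair out of six possible pairs.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6.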

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
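
As a rough illustration of the bias correction, the sketch below computes the corrected coefficient from a contingency table of counts. It follows the commonly cited form of Bergsma's correction and is not taken from the report's own code; the function name and both example tables are hypothetical, and every row and column is assumed to have a non-zero total.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Hypothetical tables, not taken from the dataset.
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
print(v_independent, v_perfect)
```

The first table shows no association and the second a perfect one, which are the two ends of the 0 to 1 range described above.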

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

Count: a simple visualization of nullity by column.
Matrix: the nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index)  df_index  employee_id  age  gender  marital_status  avg_monthly_hrs  department  last_evaluation  n_projects  salary  satisfaction  tenure  status
0        0         100101       26   Male    Unmarried       156.0            D00-SS      0.599109         2           low     0.565100      2.0     Employed
1        1         100102       25   Female  Unmarried       172.0            D00-MN      0.754200         3           high    0.486220      10.0    Employed
2        2         100103       24   Male    Unmarried       268.0            D00-ENG     0.682366         3           low     0.612525      2.0     Employed
3        3         100105       23   Female  Unmarried       192.0            D00-MT      0.759711         3           low     0.615641      3.0     Employed
4        4         100106       29   Male    Unmarried       145.0            D00-ENG     0.517110         2           low     0.517684      3.0     Left
5        5         100107       52   Male    Married         178.0            D00-ENG     0.500988         6           low     0.365291      2.0     Employed
6        6         100108       24   Male    Unmarried       184.0            D00-ENG     0.815477         3           medium  0.924365      3.0     Employed
7        7         100109       24   Male    Unmarried       177.0            D00-MT      0.461489         3           high    0.350168      3.0     Employed
8        8         100110       25   Male    Unmarried       235.0            D00-ENG     0.946399         3           medium  0.608787      2.0     Employed
9        9         100111       22   Male    Unmarried       138.0            D00-ENG     0.489286         2           low     0.369778      3.0     Left

Last rows

(index)  df_index  employee_id  age  gender  marital_status  avg_monthly_hrs  department  last_evaluation  n_projects  salary  satisfaction  tenure  status
14106    14135     148640       33   Female  Married         218.0            D00-PD      0.536230         3           low     0.754514      3.0     Employed
14107    14136     148719       28   Male    Unmarried       177.0            D00-ENG     1.000000         3           medium  0.812509      2.0     Employed
14108    14137     148737       31   Female  Married         162.0            D00-PR      0.565725         3           low     0.851234      2.0     Employed
14109    14138     148768       43   Male    Married         232.0            D00-FN      0.927203         5           medium  0.902979      5.0     Left
14110    14139     148842       42   Male    Married         232.0            D00-SS      0.436913         5           medium  0.664634      4.0     Employed
14111    14140     148877       25   Male    Unmarried       136.0            D00-SS      0.692963         3           high    0.792814      2.0     Employed
14112    14141     148879       26   Male    Unmarried       217.0            D00-MT      0.718322         3           high    0.765757      3.0     Employed
14113    14142     148916       23   Female  Unmarried       171.0            D00-ENG     0.718322         2           low     0.583852      2.0     Employed
14114    14143     148947       28   Male    Unmarried       221.0            D00-SP      0.840325         3           medium  0.795188      2.0     Employed
14115    14144     148988       22   Male    Unmarried       130.0            D00-SP      0.513891         2           medium  0.406645      3.0     Left